Robust Stochastic Chemical Reaction Networks 
and Bounded Tau- Leaping 



David Soloveichik* 
dsolov@caltech . edu 



Abstract 



The behavior of some stochastic chemical reaction networks is largely unaffected by 
slight inaccuracies in reaction rates. We formalize the robustness of state probabilities 
to reaction rate deviations, and describe a formal connection between robustness and 
efficiency of simulation. Without robustness guarantees, stochastic simulation seems to 
require computational time proportional to the total number of reaction events. Even 
if the concentration (molecular count per volume) stays bounded, the number of reac- 
tion events can be linear in the duration of simulated time and total molecular count. 
We show that the behavior of robust systems can be predicted such that the compu- 
tational work scales linearly with the duration of simulated time and concentration, 
and only polylogarithmically in the total molecular count. Thus our asymptotic anal- 
ysis captures the dramatic speedup when molecular counts are large, and shows that 
for bounded concentrations the computation time is essentially invariant with molecu- 
lar count. Finally, by noticing that even robust stochastic chemical reaction networks 
are capable of embedding complex computational problems, we argue that the linear 
dependence on simulated time and concentration is likely optimal. 

1 Introduction 



The stochastic chemical reaction network (SCRN) model of chemical kinetics is used in 
chemistry, physics, and computational biology. It describes interactions involving inte 



ger number of molecules as Markov jump processes (jMcQuarrid . 119671 ; Ivan Kampenl . 119971 ; 
Erdi Tothl . Il989l ; iGillespid . Il992l ) , and is used in domains where the traditional model of 
deterministic continuous mass action kinetics is invalid due to small molecular counts. Small 
molecular counts are prevalent in biology: for example, over 80% of the genes in the E. coli 
chromosome are expressed at fewer than a hundred copies per cell, with s ome key control 
factors present in quantities under a dozen (jGuptasarmal . Il995l : iLevinl . Il999h . Indeed, exper- 
imental observations and c omputer simulations have confirmed that stochastic effects can b e 
physiologically sig nificant (jMcAdams fc Arkinl . fl997l : lElowitz et all . 120021 : ISuel et all hoOfih . 
Consequently, the stochastic model is widely employed for modeling cel lular processes (e.g. , 
Arkin et al.1 jSl)) and is included in numerous software packages (jVasudeva fc Bhallal . 
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20041 : iKierzekL 120021 : lAdalsteinsson et all hooj ) Q The stochastic model becomes equivalent 
to the cla ssical law of mass action when the molecular counts of all participating species 
are large (iKurtzl . Il972l : lEthier & Kurtz] . Il986h . 

Gillespi e's stochastic si mulation algorithm (SSA) can be used to model the behavior 
of SCRNs (|Gillespiel . Il977h . However, simulation of systems of interest often requires an 
unfeasible amount of computational time. Some work has focused on optimizing simulation 
of large SCRNs — many different species and reaction channels. For example, certain tricks 
can improve the speed of dec i ding which reaction occurs next if there are many possible 
choices (e.g., Gibson <fe Bruck ( 2000l )). However, for the purposes of this paper we suppose 
that the number of species and reactions is relatively small, and that it is fundamentally 
the number of reaction occurrences in a given interval of time that presents the difficulty 
Because SSA simulates every single reaction event, simulation is slow when the number of 
reaction events is large. 

On the face of it, simulation should be possible without explicitly modeling every reac- 
tion occurrence. In the mass action limit, fast simulation is achieved using numerical ODE 
solvers. The complexity of the simulation does not scale at all with the actual number of 
reaction occurrences but with overall simulation time and the concentration of the species. 
If the volume gets larger without a significant increase in concentration, mass action ODE 
solvers achieve a profound difference in computation time compared to SSA0 Moreover 
maximum concentration is essentially always bounded, because the model is only valid for 
solutions dilute enough to be well mixed, and ultimately because of the finite density of 
matter. However, mass action simulation can only be applied if molecular counts of all the 
species are large. Even one species that maintains a low molecular count and interacts with 
other species prevents the use of mass action ODE solvers. 

Another reason why it seems that it should be possible to simulate stochastic chemical 
systems quickly, is that for many systems the behavior of interest does not depend crucially 
upon details of events. For example bioche mical networks tend to be robust to variations 
in concentrations and kinetic parameters ( Morohashi et al. . 2002 : Alon . 20071 ). If these 



systems are robust to many kinds of perturbations, including sloppiness in simulation, can 
we take advantage of this to speed up simulation? For example, can we approach the 
speed of ODEs b ut allow molecul a r counts of some s pecies t o be small ? Inde ed, ta u-leaping 



lgorithms (e.g., Gillespie ( 200ll ): Rathinam et al. ( 20031 ): Cao et al. ( 20061 ). see Gillespie 



( 20071 ) for a review) are based on the idea that if we allow reaction propensities to remain 
constant for some amount of time r, but therefore deviate slightly from their correct values, 
we don't have to explicitly simulate every reaction that occurs in this period of time (and 
can thus "leap" by amount of time r) . 

In this paper we formally define robustness of the probability that the system is in a 
certain state at a certain time to perturbations in reaction propensities. We also provide a 
method for proving that certain simple systems are robust. We then describe a new approx- 



*Some 
bench: 
stirator: 
BioNetS: 



stochastic simulation implementations on the web: Systems Biology Work- 
http://sbw.sourceforge.net; BioSpice: http://biospice.lbl.gov; Stocha- 

http://opnsrcbio.molsci.org STOCKS: |http: // www, sysbio .pi/ stocks 

http : //x . amath . unc . edu : 16080/BioNetS SimBiology package for MATLAB: 



http: //www.mathworks . com/product s/simbiology/index . html 

' As an illustrative example, a prokaryotic cell and a eukaryotic cell may have similar concentrations of 
proteins but vastly different volumes. 
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imate stochastic simulation algorithm called bounded tau- leaping (BTL), which naturally 
follows from our definition of robustness, and provably provides correct answers for robust 
systems. In contrast to Gillespie's and others' versions of tau-leaping, in each step of our 
algorithm the leap time, rather than being a function of the current state, is a random 
variable. This algorithm naturally avoids some pitfalls of tau-leaping: the concentrations 
cannot become negative, and the algorithm scales to SSA when necessary, in a way that 
there is always at least one reaction per leap. However, in the cases when there are "oppos- 
ing reactions" (canceling or partially cancelling each othe r) other forms of tau-leaping may 
be significantly faster (e.g.. iRathinam fc El Sama 1 (!2007h ). 



B TL seems more amena ble to theoretical analysis than Gillespie's versions (jGillespiel . 



200ll . 120031 : iGao et all l200fil ). and may thus act as a stand-in for approximate simulation 



algorithms in analytic investigations. In this paper we use the language and tools of com- 
putational complexity theory to formally study how the number of leaps that BTL takes 
varies with the maximum molecular count to, time span of the simulation t, and volume 
V. In line with the basic computational complexity paradigm, our analysis is asymptotic 
and worst-case. "Asymptotic" means that we do not evaluate the exact number of leaps 
but rather look at the functional form of the dependence of their number on to, t, and V. 
This is easier to derive and allows for making fundamental distinctions (e.g., an exponential 
function is fundamentally larger than a polynomial function) without getting lost in the 
details. "Worst-case" means that we will not study the behavior of our algorithm on any 
particular chemical system but rather upper-bound the number of leaps our algorithm takes 
independent of the chemical system. This will allow us to know that no matter what the 
system we are trying to simulate, it will not be worse than our bound. 

In this computational complexity paradigm, we show that indeed robustness helps. We 
prove an upper bound on the number of steps our algorithm takes that is logarithmic in to, 
and linear in t and total concentration C = m/V. This can be contrasted with the exact 
SSA algorithm which, in the worst case, takes a number of steps that is linear in m, i, 
and C. Since a logarithmic dependence is much smaller than a linear one, BTL is provably 
"closer" to the speed of ODE solvers for mass action systems which have no dependence on 
toQ 

Finally we ask whether it is possible to improve upon BTL for robust systems, or did 
we exhaust the speed gains that can be obtained due to robustness? In the last section 
of the paper we connect this question to a conjecture in computer science that is believed 
to be true. With this conjecture we prove that there are robust systems whose behavior 
cannot be predicted in fewer computational steps than the number of leaps that BTL makes, 
ignoring multiplicative constant factors and powers of log m. We believe other versions of 
tau-leaping have similar worst-case complexities as our algorithm, but proving equivalent 
results for them remains open. 



* Indeed, the total molecular count m can be extremely large compared to its logarithm — e.g., Avo- 
gadro's number = 6 x 10 23 while its log 2 is only 79. 
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2 Model and Definitions 



A Stochastic Chemical Reaction Network (SCRN) S specifies a set of N species Si (i G 
{1, . . . , N}) and M reactions Rj (j G {1,...,M}). The siaie of S is a vector x G 
indicating the integral molecular counts of the speciesQ A reaction Rj specifies a reactants' 
stoichiometry vector fj G N , a products' stoichiometry vector jij G N , and a real- valued 
rate constant fej > 0. We describe reaction stoichiometry using a standard chemical "arrow" 
notation; for example, if there are three species, the reaction Rj-. Si + S2 — > Si + 2S3 has 
reactants vector fj = (—1, —1,0) and products vector pj = (1, 0, 2). A reaction Rj is possible 
in state 2? if there are enough reactant molecules: (Vi) Xj — > 0. Then if reaction i?j 
occurs (or "fires") in state x, the state changes to x + i/j, where Vj G "L N is the state change 
vector for reaction Rj defined as Vj = pj — fj. We follow Gillespie and others and allow 
unary (Si — > . . .) and bimolecular (2Sj — > . . . or Si + SV — > ■ ■ ■, i ^ i') react ions only. 



Sometimes the model is extended to higher-order reactions (Ivan Kampenl . 119971 ). but the 
merit of this is a matter of some controversy. 

Let us fix an SCRN S. Given a starting state x~q and a fixed volume V, we can define a 
continuous-time Markov process we call an SSA proces C of S according to the following 
stochastic kinetics. Given a current state x, the propensity function a,j of reaction Rj 
is defined so that aj{x)dt is the probability that one Rj reaction will occur in the next 
infinitesimal time interval [t,t + dt). If Rj is a unimolecular reaction Si — > ... then the 
propensity is proportional to the number of molecules of Si currently present since each is 
equally likely to react in the next time instant; specifically, aAx) = kjXi for reaction rate 
constant kj. If Rj is a bimolecular reaction Si + SV — > . . ., where i 7^ i', then the reaction 
propensity is proportional to XiXy , which is the number of ways of choosing a molecule of 
Si and a molecule of S^, since each pair is equally likely to react in the next time instant. 
Further, the probability that a particular pair reacts in the next time instant is inversely 
proportional to the volume, resulting in the propensity function aj(x) = kj Xz y i ' . If Rj is a 
bimolecular reaction 2Sj — ► . . . then the number of ways of choosing two molecules of Si to 
react is , and the propensity function is aj(x) = kj ■ 

Since the propensity function aj of reaction Rj is defined so that aj{x)dt is the proba- 
bility that one Rj reaction will occur in the next infinitesimal time interval [t, t + dt), state 
transitions in the SSA process are equivalently described as follows: If the system is in 
state x, no further reactions are possible if ^ aj(x) = 0. Otherwise, the time until the next 
reaction occurs is an exponential random variable with rate ^ ■ ctj (x) . The probability that 
next reaction will be a particular Rj* is ctj*(x)/ Y^j ctj(x). 

We are interested in predicting the behavior of SSA processes. While there are poten- 
tially many different questions that we could be trying to answer, for simplicity we define 
the prediction problem as follows. Given an SSA process C, a time t, a state x, and 5 > 0, 
prediclH whether C is in x at time t, such that the probability that the prediction is incor- 



*N = {0, 1,2, . . .} and Z = {. . . , -1, 0, 1, . . .}. 

^It is exactly th e stochastic process simulated by Gillespie's Stochastic Simulation Algorithm 
(SSA) (|Gillespid . fl977h . 



*We phrase the prediction problem in terms appropriate for a simulation algorithm. An alternative 
formulation would be the problem of estimating the probability that the SSA process is in x at time t. To 
be able to solve this problem using a simulation algorithm we can at most require that with probability at 
least <5i the estimate is within <?2 of the true probability for some constants 81,82 > 0. This can be attained 
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rect is at most 5. In other words we are interested in algorithmically generating values of a 
Bernoulli random variable I(x, t) such that the probability that I(x, t) = 1 when C is not 
in x at time t plus the probability that I(x, t) = when C is in x at time t is at most 5. We 
assume 5 is some small positive constant. We can easily extend the prediction problem to a 
set of states T rather than a single target state x by asking to predict whether the process 
is in any of the states in T at time t. Since T is meant to capture some qualitative feature 
of the SSA process that is of interest to us, it is called an outcome. 

By decreasing the volume V (which speeds up all bimolecular reactions), increasing t, or 
allowing for more molecules (up to some bound m) we are increasing the number of reaction 
occurrences that we may need to consider. Thus for a fixed SCRN, one can try to upper 
bound the computational complexity of the prediction problem as a function of V, t, and m. 
Given a molecular count bound m, we define the bounded- count prediction problem as before, 
but allowing an arbitrary answer if the molecular count exceeds m within time t. Suppose 
V is a bounded-count prediction problem with molecular count bound m, error bound 8, 
about time t and an SSA process in which the volume is V. We then say V is a (m, t, C, 5)- 
prediction problem where C = m/V is a bound on the maximum concentration]^ Fixing 
some small 5, we study how the computational complexity of solving (to, t, C, <5)-prediction 
problems may scale with increasing m, t, and C. If the (m,t,C, 5) -prediction problem is 
regarding an outcome T consisting of multiple states, we require the problem of deciding 
whether a particular state is in T to be easily solvable. Specifically we require it to be 
solvable in time at most poly logarithmic in m, which is true for any natural problem. 

It has been observed that permitting propensities to deviate slightly from their correct 
values, allows for much faster simulation, especially if the molecular counts of some species 
are large. This idea forms the basis of approximate stochastic simulation algorithms such 



as tau-leaping ([Gillespie! . l200ll ). As opposed to the exact SSA process described above, 
consider letting the propensity function vary stochastically. Specifically, we define new 
propensity functions a'-{x,t) = £j(t)a,j(x) where {£j(t)} are random variables indexed by 
reaction and time. The value of £j(t) describes the deviation from the correct propensity 
of reaction Rj at time t, and should be close to 1. For any SSA process V we can define 
a new stochastic process called a perturbation of V through the choice of the distributions 
of {£j(t)}. Note that the new process may not be Markov, and may not possess Poisson 
transition probabilities. If there is a < p < 1 such that Vj, t, (1 — p) < £j(t) < (1 + p), 
then we call the new process a p-perturbation. There may be systems exhibiting behavior 
such that any slight inexactness in the calculation of propensities quickly gets amplified 
and results in qualitatively different behavior. However, for some processes, if p is a small 
constant, the />perturbation may be a good approximation of the SSA process. That a p- 
perturbation is bounded multiplicatively (i.e., that acts multiplicatively) corresponds 
to our intuitive notion that proportionally larger deviations are required to have an effect 
if the affected propensity is large. 

We now define our notion of robustness. Intuitively, we want the prediction problem to 
not be affected even if reaction propensities vary slightly. Formally, we say an SSA process 
C is (p,5)-robust with respect to state x at time t if for any p-deviating process C based 



by running the simulation algorithm a constant number of times. 

'Maximum concentration C is a more natural measure of complexity compared to V because similar to 
m and t, computational complexity increases as C increases. 
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on C, the probability of being in x at time t is within plus or minus 5 of the corresponding 
probability for C. This definition can be extended to an outcome T similar to the definition 
on the prediction problem. Finally we say an SSA process C is (p, 5) -robust with respect 
to a prediction problem (or bounded-count prediction problem) V if C is (p, <5)-robust with 
respect to the same state (or outcome) as specified in V, at the same time t as specified in 
V. 

For simplicity, we often use asymptotic notation. The notation 0(1) is used to denote an 
unspecified positive constant. This constant is potentially different every time the expression 
0(1) appears. 

3 Robustness Examples 

In this section we elucidate our notion of robustness by considering some examples. In gen- 
eral, the question of whether a given SSA process is (p, <5)-robust for a particular outcome 
seems a difficult one. The problem is especially hard because we have to consider every 
possible p-perturbation — thus we may not even be able to give an approximate character- 
ization of robustness by simulation with SSA. However, we can characterize the robustness 
of certain (simple) systems. 

For an SSA process or p-perturbation C, and outcome T, let F r (C,t) be the probability 
of being in T at time t. Consider the SCRN shown in Figure HJ^a). We start with 300 
molecules of Si and £3 each, and are interested in the outcome T of having at least 150 
molecules of S4. The dashed line with circles shows F for the correct SSA process C. 
(All plots of F are estimated from 10 3 SSA runs.) The two dashed lines without circles 
show F for two "extremal" p-perturbations: C +p with constant £j(t) = 1 + p, and C~ p 
with constant £j(t) = 1 — p. What can we say about other p-perturbations, particularly 
where the £j(t) have much more complicated distributions? It turns out that for this 
SCRN and T, we can prove that any p-perturbation falls within the bounds set by the 
two extremal p-perturbations C~ p and C +p . Thus F for any p-perturbation falls within the 
dashed lines. Formally, C is monotonic with respect to T using the definition of monotonicity 
in Appendix IA.31 This is easily proven by Lemma IA.5I because every species is a reactant 
in at most one reaction. Then by Lemma |A.4|. F r (C~ p ,t) < F r (C,t) < F r (C +p ,t) for any 
p-perturbation C. 

To see how the robustness of this system can be quantified using our definition of (p, 6)- 
robustness, first consider two time points t = 4.5 and t = 6. At t = 4.5, the probability that 
the correct SSA process C has produced at least 150 molecules of S4 is slightly more than 
0.5. The corresponding probability for p-perturbations of C can be no larger than about 
0.95 and no smaller than about 0.1. Thus C is (p, (5)-robust with respect to outcome T at 
time t = 4.5 for p = 0.1 and 5 approximately 0.45, but not for smaller 5. On the other hand 
at t = 6, the dashed lines are essentially on top of each other, resulting in a tiny 5. In fact 
5 is small for all times less than approximately 3.5 or greater than approximately 5.5. 

What information did we need to be able to measure (p, <5)-robustness? Processes C~ p 
and C +p are simply C scaled in time. Thus knowing how F r (C, t) varies with t allows one to 
quantify (p, <5)-robustness at the various times; F r (C, t) can be estimated from multiple SSA 
runs of C as in Figure HJ Intuitively, C is (p, <5)-robust for small 5 at all times t when F r (C, t) 
does not change quickly with t (see Appendix IA.3|) . For systems that are not monotonic, 
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Figure 1: Examples of SCRNs exhibiting contrasting degrees of robustness. The SSA process C 
and outcome T are defined for the two systems by: (a) Rate constants: k\ = 1, k% = 0.001; start 
state: a^o = (300,0,300,0); outcome T: X4 > 150. (b) Rate constants: k\ — 0.01, k 2 = 0.01; 
start state: xo = (300,10,10); outcome T: x 2 > 160. Plots show F r (-,t) for an SSA process or 
p-perturbation estimated from 10 3 SSA runs. (Dashed line with circles) Original SSA process C. 
(Dashed lines without circles) The two extremal p-perturbations: C +p with constant £j(t) = 1 + p, 
and C~ p with constant £j(t) = 1 — p. For SCRN (b) we also plot F r (-,t) for a p-perturbation with 
constant £i(t) = 1+p, = 1 — P (triangles), or constant £i(t) = 1 — p, ^(*) =1 + p (diamonds). 
Perturbation parameter p = 0.1 throughout. 

knowing how F r (C,t) varies with time may not help with evaluating (p, <5)-robustness. 

Indeed, for a contrasting example, consider the SCRN in Figure [T^b). We start with 
300 molecules of Si, 10 molecules of S2, and 10 molecules of S3, and we are interested in 
the outcome of having at least 160 molecules of S2. Since Si is a reactant in both reactions, 
Lemma IA.5I cannot be used. In fact, the figure shows two p-perturbations (triangles and 
diamonds) that clearly escape from the boundaries set by the dashed lines. The triangles 
show F for the p-perturbation where the first reaction is maximally sped up and the second 
reaction is maximally slowed down. (Vice versa for the diamonds.) For characterization of 
the robustness of this system via (p, <5)-robustness, consider the time point t = 2.5. The 
probability of having at least 160 molecules of S2 in the correct SSA process C is around 0.5. 
However, this probability for p-perturbations of C can deviate by at least approximately 
0.4 upward and downward as seen by the two p-perturbations (triangles and diamonds). 
Thus at this time the system is not (p, <5)-robust for 5 approximately 0.4. What about 
other p-deviations? It turns out that for this particular system, the two p-perturbations 
corresponding to the triangles and diamonds bound F in the same way that C~ p and C +p 
bounded F in the first example (exercise left to the reader). Nonetheless, for general systems 
that are not monotonic it is not clear how one can find such bounding p-perturbation and 
in fact they likely would not exist. 
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Of course, there are other types of SSA process that are not like either of the above 
examples: e.g., systems that are robust at many times but not monotonic. General ways of 
evaluating robustness of such systems remains an important open problem. 

Finally, it is important to note that quantifying the robustness of SSA processes, even 
monotonic ones, seems to require computing many SSA runs. This is self-defeating when in 
practice one wants to show that the given SSA process is (p, 5)-robust in order to justify the 
use of an approximate simulation algorithm to quickly simulate it. In these cases, we have 
to consider (p, <5)-robustness a theoretical notion only. Note, however, that it may be much 
easier to show that a system is not robust by comparing the simulation runs of different 
p-perturbations, since the runs can be quickly obtained using fast approximate simulation 
algorithms such as that presented in the next section. 



4 Bounded Tau-Leaping 
4.1 The Algorithm 

We argued in the Introduction that sloppiness can allow for faster simulation. In this 
section we give a new approximate stochastic simulation algorithm called bounded tau- 
leaping (BTL) that simulates exactly a certain p-perturbation rather than the original SSA 
process. Consequently, the algorithm solves the prediction problem with allowed error 5 for 
(p, 5)-robust SSA processes. 



The algorithm is a variant of existing tau-leaping algorithms (jGillespiej . 120071 ) . However, 
while other tau-leaping algorithms have an implicit notion of robustness, BTL is formally 
compatible with our explicit definition. As we'll see below, our algorithm also has certain 
other advantages over many previous tau-leaping implementations: it naturally disallows 
negative concentrations and scales to SSA in a manner that there is always at least one 
reaction per leap. It also seems easier to analyze formally; obtaining a result similar to 
Theorem 14. II is an open question for other tau-leaping variants. 

BTL has overall form typical of tau-leaping algorithms. Rather than simulating every 
reaction occurrence explicitly as per the SSA, BTL divides the simulation into leaps which 
group multiple reaction events. The propensities of all of the reactions are assumed to be 
fixed throughout the leap. This is obviously an approximation since each reaction event af- 
fects molecular counts and therefore the propensities. However, this approximation is useful 
because simulating the system with the assumption that propensities are fixed turns out to 
be much easier. Instead of having to draw random variables for each reaction occurrence, 
the number of random variables drawn to determine how many reaction firings occurred in 
a leap is independent of the number of reaction firings. Thus we effectively "leap" over all 
of the reactions within a leap in few computational steps. If molecular counts do not change 
by much within a leap then the fixed propensities are close to their correct SSA values and 
the approximation is good. 

Our definition of a p-perturbation allows us to formally define "good" . We want to guar- 
antee that the approximate process that tau-leaping actually simulates is a p-perturbation 
of the exact SSA process. We can achieve this as follows. If x is the state on which the 
leap started, throughout the leap the simulated reaction propensities are fixed at their 
SSA propensities on x: aj{x). Then for any state y within the leap we want the cor- 
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rect SSA propensities aj(y) to satisfy the following p -perturbation constraint (0 < p < 1): 
(1 — p)aj(y) < dj(x) < (1 + p)aj(y). As soon as we reach a state y for which this constraint 
is violated, we start a new leap at y which will use simulated reaction propensities fixed at 
aj(y). This ensures that at any time in the simulation, there is some (1 — p) < £j(t) < (1+p) 
such that multiplying the correct SSA propensity of reaction Rj by £j(t) yields the propen- 
sity of Rj that the simulation algorithm is actually using. Therefore, we actually simulate a 
p-perturbation, and for (p, <5)-robust SSA processes, the algorithm can be used to provably 
solve the prediction problem with error 5. 

Can we implement this simulation quickly, and, as promised, do little computation per 
leap? Note that in order to limit the maximum propensity deviation in a leap, we need to 
make the leap duration be a random variable dependent upon the stochastic events in the 
leap. If we evaluate ctj(y) after each reaction occurrence in a leap to verify the satisfaction 
of the p-perturbation constraint, we do not save time over SSA. However, we can avoid this 
by using a stricter constraint we call the {s%j} -perturbation constraint (0 < < 1), defined 
as follows. If the leap starts in state x, reaction Rj is allowed to change the molecular count 
of species Si by at most plus or minus EijXi within a leap. Again, as soon as we reach a 
state y where this constraint is violated, we start a new leap at 

For any p, we can find a set of {s%j} bounds such that satisfying the {ejj}-perturbation 
constraint satisfies the p-perturbation constraint. In Appendix IA.1I we show that for any 

SCRN, the p-perturbation constraint is satisfied if < jfjU — \J ) ; where M is the 
number of reactions in the SCRN. 

Simulating a leap such that it satisfies the {ejjj-perturbation constraint is easy and only 
requires drawing M gamma and M — 1 binomial random variables. Suppose the leap starts 
in state x. For each reaction Rj, let bj be the number of times Rj needs to fire to cause 
a violation of the {£ij} bounds for some species. Thus bj is the smallest positive integer 
such that \bjVij | > EijXi for some Si- To determine r, the duration of the leap, we do the 
following. First we determine when each reaction Rj would occur bj times, by drawing from 
a gamma distribution with shape parameter bj and rate parameter a,j. This generates a 
time Tj for each reaction. The leap ends as soon as some reaction Rj occurs bj times; thus to 
determine the duration of the leap r we take the minimum of the Tj's. At this point we know 
that the first-violating reaction Rj* — the one with the minimum Tj* — occurred bj* times. 
But we also need to know how many times the other reactions occur. Consider any other 
reaction Rj (J ^ j*). Given that the bjth. occurrence of reaction Rj would have happened 
at time Tj had the leap not ended, we need to distribute the other bj — 1 occurrences to 
determine how many happen before time r. The number of occurrences at time r is given 
by the binomial distribution with number of trials bj{x) — 1 and success probability t/tj. 
This enables us to define BTL as shown in Figure [2j 

The algorithm is called "bounded" tau-leaping because the deviations of reaction propen- 
sities within a leap are always bounded ac cording to p. This is in contrast with other 

tau-leaping algorithms, such as Gillespie's (jCao et all boOfih . in which the deviations in 



reaction propensities are small with high probability, but not always, and in fact can get 



*An added benefit of providing {eij} bounds rather than p as a parameter to the BTL algorithm is that 
it allows flexibility on the part of the user to assign less responsibility for a violation to a reaction that is 
expected to be fast compared to a reaction that is expected to be slow. This may potentially speed up the 
simulation, while still preserving the p-perturbation constraint. 
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0. Initialize with time t = to and the system's state x = x$. 

1. With the system in state x at time t, evaluate all the propensities aj, and determine 
firing bounds bj for all possible reactions, where bj is the smallest positive integer 
such that \bjVij\ > EijXi for some Si. 

2. Generate violating times r,- ~ Gamma(6j, aj) for all possible reactions. 

3. Find the first-violating reaction and set the step size to the time of the first viola- 
tion: let j* = argmhij{Tj} and r = Tj*. 

4. Determine the number of times each possible reaction occurred in interval r: for 
j j* , rij ~ Binomial(6j — 1,t/tj); for j* , nj* = bj*. 

5. Effect the leap by replacing t <— t + r and x «— x + ^ ■ Vjtij. 

6. Record (if, i) as desired. Return to Step 1, or else end the simulation. 



Figure 2: The bounded tau-leaping (BTL) algorithm. The algorithm is given the SCRN, the initial 
state Xq, the volume V, and a set of perturbation bounds {£ij} > 0. If the state at a specific time 
tf is desired, the algorithm checks if t + t > tf in step (3), and if so uses r = tf — r, and treats 
all reactions as not first-violating in step (4). Gamma(n, A) is a gamma distribution with shape 
parameter n and rate parameter A. Binomial(n,p) is a binomial distribution with number of trials 
n and success probability p. 



arbitrarily high if the simulation is long enough. This allows BTL to satisfy our definition 
of a p-perturbation, and permits easier analysis of the behavior of the algorithm (see next 
section). 

As any algorithm exactly simulating a p-perturbation would, BTL naturally avoids 
negative concentrations. Negative counts can occur only if an impossible reaction happens 
- in some state x reaction Rj fires for which aj(x) = 0. But since in a /j-perturbation 
propensity deviations are multiplicative, in state x, a'Jx,t) = £j(t)aj(x) = and so Rj 
cannot occur. Further, no matter how small the {£ij} bounds are, there is always at least 
one reaction per leap and thus BTL cannot take more steps than SSA. 

On the negative side, in certain cases th e BTL algorithm can t ake many more leaps 



than Gillespie's tau-leaping (jGillespiel . l200ll . 1200.4 ICao et all [2OO6) and other versions. 



Consider the case where there are two fast reactions that partially undo each others' effect 
(for example the reactions may be reverses of each other). While both reactions may be 



occurr ing very rapidly, their propensities may be very similar (e.g., iRathinam &; El Samad 
(|2007h ). Gillespie's tau-leaping will attempt to leap to a point where the molecular counts 
have changed enough according to the averaged behavior of these reactions. However, our 
algorithm considers each reaction separately and leaps to the point where the first reaction 
violates the bound on the change in a species in the absence of the other reactions. Thus in 
this situation our algorithm would perform unnecessarily many leaps for the desired level 
of accuracy. 
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4.2 Upper Bound on the Number of Leaps 

Suppose we fix some SCRN of interest, and run BTL on different initial states, volumes, and 
lengths of simulated time. How does varying these parameters change the number of leaps 
taken by BTL? In this section, we prove that no matter what the SCRN is, we can upper 
bound the number of leaps as a function of the total simulated time t, the volume V, and 
the maximum total molecular count m encountered during the simulation. For simplicity 
we assume that all the £y are equal to some global s. (Alternatively, the theorem and proof 
can be easily changed to use min/max {£«•/} values where appropriate.) 

Theorem 4.1. For any SCRN S with M species, any e such that < e < 1/(12M) ; and 
any S > 0, there are constants c\,C2,C3 > such that for any bounds on time t and total 
molecular count m, for any volume V and any starting state, after c\ logm + C2 t (C + C3) 
leaps where C = m/V , either the bound on time or the bound on total molecular count will 
be exceeded with probability at least 1 — 5. 

Proof. The proof is presented in Appendix IA.21 □ 

Note that the upper bound on e implies that the algorithm is exactly simulating some 
p-perturbation (see previous section). 

Intuitively, a key idea in the proof of the theorem is that the propensity of a reaction 
decreasing a particular species is linear to the amount of that species (since the species must 
appear as a reactant). This allows us to bound the decrease of any species if a leap is short. 
Actually this implies that a short leap probably increases the amount of some species by a 
lot (some species must cause a violation — if not by a decrease it must be by an increase). 
This allows us to argue that if we have a lot of long leaps we exceed our time bound t and 
if we have a lot of short leaps we exceed our bound on total molecular count m. In fact 
because the effect of leaps is multiplicative, logarithmically many short leaps are enough to 
exceed m. 

It is informative to compare this result with exact SSA, which in the worst case takes 
0(1) mt(C + O(l)) steps, since each reaction occurrence corresponds to an SSA step and 
the maximum reaction propensity is kjm 2 /V or kjm. Since m can be very large, the speed 
improvement can be profound. 

We believ e , alth ough it remains to be proven, that other versions of tau-leaping (see 



e.g 



Gillespie! (|20071 ) for a review) achieve the same asymptotic worst case number of leaps 
as our algorithm. 

How much computation is required per each leap? Each leap involves arithmetic oper- 
ations on the molecular counts of the species, as well as drawing from a gamma and bino- 
mial distributions. Since there a re fast algorithm s for o b taining instances of gamma and 



binom ial random variables (e.g., Ahrens &: Dieter ( 19781 ): Kachitvichyanukul &: Schmeiser 



we do not expect a leap of BTL to require much more computation than other 



forms of tau-leaping, and should not be a major contributor to the total running time. 
Precise bounds are dependent on the model of computation. (In the next section we state 
reasonable asymptotic bounds on the computation time per leap for a randomized Turing 
machine implementation of BTL.) 
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5 On the Computational Complexity of the Prediction Prob- 
lem for Robust SSA Processes 



What is the computational complexity inherent in the prediction problem for robust SSA 
processes? Is simulation with BTL a good procedure for solving the problem, or are there 
faster methods not involving simulation that somehow directly compute desired end-time 
probabilities without evaluating the entire trajectory? We have shown that the number of 
leaps that BTL takes scales at most linearly with t and C. However, for some systems there 
are analytic shortcuts to determining the probability of being in T at time t. For instance 
the "expone ntial decay" SCRN consist i ng of the single reaction Si — > S2 is easily solvable 
analytically (IMalek-Mansour fc Nicolisj . [l975h . The calculation of the probability of being 



in any given state at any given time t (among other questions) can be solved in time that 
grows minimally with t and C. Indeed, it may seem possible that robust SSA processes are 
somehow behaviorally weak and that their behavior can be easily predicted. 

In order to be able to consider these questions formally, we specify our model of com- 
putation as being randomized Turing machines (see below). We say that computation time 
poly logarithmic in the maximum total molecular count m is efficient in m (in fact logm 
computation time is required to simply read in the initial state of the SSA process and 
target state of the prediction problem). In this section we prove that for any algorithm 
efficient in m solving prediction problems for robust SSA processes, there are prediction 
problems about such processes that cannot be solved faster than linear in t and C. We 
prove this result assuming a reasonable conjecture in computational complexity theory. 
Then, with certain caveats regarding implementing BTL on a Turing machine, simulation 
with BTL is optimal in t and C for solving prediction problems for robust SSA processes 
among algorithms efficient in m. 

We use the standard model of computation which captures stochastic behavior: ran- 
domized Turing machines (TM). A randomized TM is a non-deterministic Tm3 allowing 
multiple possible transitions at a point in a computati on. The actual transition taken is 



uniform over the choices. (See for example ISipserl (119971 ) for equivalent formalizations.) We 



say a given TM on a given input runs in computational time tt m if there is no set of random 
choices that makes the machine run longer. 

We want to show that for some SCRNs, there is no method of solving the prediction 
problem fast, no matter how clever we are. We also want these stochastic processes to be 
robust despite ha ying difficult predictio n problems. We use the following two ideas. First, a 



method based on lAngluin et al.1 (|2006l ) shows that predicting the output of given random- 
ized TMs can be done by solving a prediction problem for certain robust SSA processes. 
Second, an open conjecture, but one that is strongly compatible with the basic beliefs of 
computational complexity theory, bounds how quickly the output of randomized TMs can 
be determined. 

Computational complexity theory concerns measuring how the computational resources 
required to solve a given problem scale with input size n (in bits). The two most prevalent 
efficiency measures are time and space — the number of TM steps and the length of the 
TM tape required to perform the computation. We say a Boolean function f(x) is proba- 



* Arbitrary finite number of states and tapes. Without loss of generality, we can assume a binary alphabet. 



12 



bilistically computable by a TM M in time t(n) (where n = \x\) and space s(n) if M(x) runs 
in time t{n) using space at most s(n), and with probability at least 2/3 outputs f(x) A 
basic tenet of computational complexity is that allowing asymptotically more computation 
time t(n) always expands the set of problems that can be solved. Thus it is widely believed 
that for any (reasonable) t(n), there are "i(n)-hard" functions that can be probabilistically 
computed in t(n) time, but not in asymptotically smaller timeil For our argument we will 
need such a t(ra)-hard function, but one that does not require too much space. Formally we 
assume the following hierarchy conjecture: 

Conjecture 5.1 ((Probabilistic, Space-Limited) Time Hierarchy). For any a < 1, and 

polynomials t(n) and s(n) such that t{n) a and s{n) are at least linear, there are Boolean 
functions that can be probabilistically computed within time and space bounds bounds t(n) 
and s(n), but not in time 0(l)t(n) a (with unrestricted space usage). 

Intuitively, we take a Boolean function that requires t(n) time and embed it in a chemical 
system in such a way that solving the prediction problem is equivalent to probabilistically 
computing the function. The conjecture implies that we cannot solve the prediction prob- 
lem fast enough to allow us to solve the computational problem faster than t(n). Further, 
since the resulting SSA process is robust, the result lower-bounds the computational com- 
plexity of the prediction problem for robust processes. Note that we need a time hierarchy 
conjecture that restricts the space usage and talks about probabilistic computation because 
it is i mpossible to emb e d a T M computation in an SCRN such that its computation is error 
free ( Soloveichik et al. . 20081 ). and further such embedding seems to require more time as 



the space usage increases. 

The following theorem lower-bounds the computational complexity of the prediction 
problem. The bound holds even if we restrict ourselves to robust processes. It shows that 
this computational complexity is at least linear in t and C, as long as the dependence on m 
is at most polylogarithmic. It leaves the possibility that there are algorithms for solving the 
prediction problem that require computation time more than polylogarithmic in m but less 
than linear in t or C. Let the prediction problem be specified by giving the SSA process (via 
the initial state and volume), the target time t, and the target outcome T in some standard 
encoding such that whether a state belongs to T can be computed in time polylogarithmic 
in m. 

Theorem 5.1. Fix any perturbation bound p > and 5 > 0. Assuming the hierarchy 
conjecture ( Conjecture \5.1\) . there is an SCRN S such that for any prediction algorithm A 
and constants ci, C2, /?, rj, 7 > 0, there is an SSA process C of S and a (m, t, C, 1/3) -prediction 
problem V of C such that A cannot solve V in computational time c\ (logm)' 3 t v (C + C2) 7 
if rj < 1 or 7 < 1. Further, C is (p,5)-robust with respect to V. 

Proof. The proof is presented in Appendix IA.51 □ 



*Any other constant probability bounded away from 1/2 will do just as well: to achieve a larger constant 
probability of being correct, we can repeat the computation a constant number of times and take majority 
vote. 

^If we do not allow any chance o f error, t he correspond i ng statement is proven as the (deterministic) 
time hierarchy theorem (ISipserl . [l997l ). Also see IfBarakl (|2002l ) ; iFortnow Sz Santhan am (2006) for progress in 
proving the probabilistic version. 
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In Appendix IA. 61 we argue that BTL on a randomized TM runs in total computation 
time 

0(l)((log(m))°« + Z)t(C + 0(l)) 

where, in each leap, polylogarithmic time in m is required for arithmetic manipulation of 
molecular counts, and I captures the extra computation time required for the real number 
operations and drawing from the gamma and binomial distributions. Here I is potentially 
a function of to, V, t, and the bits of precision used, but assuming efficient methods for 
drawing the random variables I is likely very small compared to the total number of leaps. 
Then, assuming we can ignore errors introduced due to finite precision arithmetic and 
approximate random number generation, simulation with BTL is an asymptotically optimal 
way in t and C of solving the prediction problem for robust processes among methods 
efficient in to. 



6 Discussion 

The behavior of many stochastic chemical reaction networks does not depend crucially on 
getting the reaction propensities exactly right, prompting our definition of p-perturbations 
and (p, ^-robustness. A p-perturbation of an SSA process is a stochastic process with 
stochastic deviations of the reaction propensities from their correct SSA values. These 
deviations are multiplicative and bounded between 1 — p and 1 + p. If we are concerned 
with how likely it is that the SSA process is in a given state at a given time, then (p, 6)- 
robustness captures how far these probabilities may deviate for a p-perturbation. 

We formally showed that predicting the behavior of robust processes does not require 
simulation of every reaction event. Specifically, we described a new approximate simulation 
algorithm called bounded tau-leaping (BTL) that simulates a certain p-perturbation as 
opposed to the exact SSA process. The accuracy of the algorithm in making predictions 
about the state of the system at given times is guaranteed for (p, <5)-robust processes. We 
proved an upper bound on the number of leaps of BTL that helps explain the savings 
over SSA. The bound is a function of the desired length of simulated time i, volume V, and 
maximum molecular count encountered m. This bound scales linearly with t and C = m/V, 
but only logarithmically with to, while the total number of reactions (and therefore SSA 
steps) may scale linearly with t, C, and m. When total concentration is limited, but the 
total molecular count is large, this represents a profound improvement over SSA. Because 
the number of BTL leaps scales only logarithmically with to, BTL asymptotically nears the 
speed of mass action ODE solvers — which have no dependence on to. We also argue that 
asymptotically as a function of t and C our algorithm is optimal in as far as no algorithm 
can achieve sublinear dependence of the number of leaps on t or C. This result is proven 
based on a r easonable a ssum ption in computational complexity theory. Unlike Gillespie's 
tau-leaping ( Cao et al. . 20061 ). our algorithm seems better suited to theoretical analysis. 



Thus while we believe other versions of tau-leaping have similar worst-case running times, 
the results analogous to those we obtain for BTL remain to be proved. 

Our results can also be seen to address the following question. If concerned solely with 
a particular outcome rather than with the entire process trajectory, can one always find 
certain shortcuts to determine the probability of the outcome without performing a full 
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simulation? Since our lower bound on computation time scales linearly with t, it could 
be interpreted to mean that, except in problem-specific cases, there is no shorter route to 
predicting the outcomes of stochastic chemical processes than via simulation. This negative 
result holds even restricting to the class of robust SSA processes. 

While the notion of robustness is a useful theoretical construct, how practical is our 
definition in deciding whether a given system is suitable to approximate simulation via BTL 
or not? We prove that for the class of monotonic SSA processes, robustness is guaranteed 
at all times when in the SSA process the outcome probability is stable over an interval of 
time determined by p. However, it is not clear how this stability can be determined without 
SSA simulation. Even worse, few systems of interest are monotonic. Consequently, it is 
compelling to develop techniques to establish robustness for more general classes of systems. 
A related question is whether it is possible to con nect our notion of robustness to previo usly 
studied notions in mass action stability analysis (IHorn fc Jacksonl . Il972l : ISontaeL l2007l ). 



A Appendix 

A.l Enforcing the p-Perturbation Constraint by the {e^ j-Perturbation 
Constraint 

Recall that in Section 14.11 we introduced two constraints bounding the number of reaction 
events within a leap. If x is the state at the beginning of the leap, the p-perturbation 
constraint is satisfied if for every reaction Rj, (1 — p)aj(y) < aj(x) < (1 + p)aj(y) for any 
state y within the leap. The {e^l-perturbation constraint is satisfied if no reaction Rj 
changes the molecular count of species Si by more than plus or minus e^Xj within the leap. 
For a given p, we would like to find an appropriate {e^j-perturbation constraint to use in the 
BTL algorithm such that we satisfy the p-perturbation constraint, thereby ensuring that 
we are exactly simulating some p-perturbation. To avoid making the {ejjj-perturbation 
constraint tighter than necessary requires knowledge of the exact reactions in the given 

SCRN. Nevertheless, worst-case analysis below shows that setting Eij < 4§y(l — \J 1 ^p 9 ) 
works for any SCRN of M reactions. 

If £y = e then, for any SCRN with M reactions, the maximum change of any species 
Si within a leap allowed by the {eyj-perturbation constraint is plus or minus Mexi. We 
want to find an e > such that if the changes to all species stay within the Me bounds, 
then no reaction violates the p-perturbation constraint. Consider a bimolecular reaction Rj: 
2Si — > . . . first. The algorithm simulates its propensity 1)/V throughout 

the leap. If xi = or 1, then a,j(x) = 0, and as long as Me < 1, then still a,j(y) = 0, satisfying 
the p-perturbation constraint for Rj. Otherwise, if X{ > 2, then the SSA propensity at 
state y within the leap is aj(y) = kjyi(yi — 1)/V < kj(l + Me)xi((l + Me)xi — 1)/V, 
and so the left half of the p-perturbation constraint (1 — p)aj(y) < ctj(x) is satisfied if 
(1 — p){l + Me)xi((l + Me)xi — 1) < Xi(x{ — 1). Similarly, the right half of the ^-perturbation 
constraint dj(x) < (1 + p)aj(y) is satisfied if (l + p)(l — Me)xi((l — Ms)xi — 1) > Xi(xi-l). 

These inequalities are satisfied for xi > 2 when e < 4§j(l — yj 1 ) (which also ensures 
that Me < 1). 

In a likewise manner, for a unimolecular reaction Rj: Si — > . . ., the /^-perturbation 
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constraint is satisfied if (1 — p)(l + Me)xi < Xi and (1 + p)(l — Me)xi > X{, and for a 
bimolecular reaction Rj\ S^ + SV the constraint is satisfied if (1 — p)(l + Me) 2 XiXi> < 

XiXit and (1 + p)(l — Me) 2 XiXi> > XiXy. It is easy to see that setting e as above also satisfies 
the inequalities for these reaction types. 

Throughout the paper we assume that p, e or {Eij } are fixed and most of our asymptotic 
results do not show dependence on these parameters. Nonetheless, we can show that for a 
fixed SCRN and for small enough p, e can be within the range 0(l)p < e < 0(l)p and thus 
scales linearly with p. Therefore, in asymptotic results, the dependence on e and p can be 
interchanged. Specifically, the e dependence explored in Appendix IA.2I can be equally well 
expressed dependence on p. 

A. 2 Proof of Theorem I4.lt Upper Bound on the Number of Leaps 

In this section we prove Theorem 14.11 from the text, which upper-bounds the number of 
leaps BTL takes as a function of m, t, and C: 

Theorem. For any SCRN S with M species, any e such that < e < 1/(12M), and 
any 5 > 0, there are constants ci,C2,C3 > such that for any bounds on time t and total 
molecular count m, for any volume V and any starting state, after c\ logm + C2 t (C + C3) 
leaps where C = m/V , either the bound on time or the bound on total molecular count will 
be exceeded with probability at least 1 — 5. 

We prove a more detailed bound than stated in the theorem above which explicitly shows 
the dependence on e hidden in the constants. Also since we introduce the asymptotic results 
only the end of the argument, the interested reader may easily investigate the dependence 
of the constants on other parameters of the SCRN such as N, M, Uij, and kj. We also show 
an approach to probability 1 that occurs exponentially fast as the bound increases: if the 
bound above evaluates to n, then the probability that the algorithm does not exceed m or 
t in n leaps is at most 2e~°^ n . 

Our argument starts with a couple of lemmas. Looking within a single leap, the first 
lemma bounds the decrease in the molecular count of a species due to a given reaction as a 
function of time. The argument is essentially that for a reaction to decrease the molecular 
count of a species, that species must be a reactant, and therefore the propensity of the 
reaction is proportional to its molecular count. Thus we see a similarity to an exponential 
decay process and use this to bound the decrease. Note that a similar result does not hold for 
the increase in the molecular count of a species, since the molecular count of the increasing 
species need not be in the propensity functionQ Then the second lemma uses the upper 
bound on how fast a species can decrease (the first lemma), together with the fact that in a 
leap some reaction must change some species by a relatively large amount, to classify leaps 
into those that either (1) take a long time or (2) increase some species significantly without 
decreasing any other species by much. Finally we show that this implies that if there are 
too many leaps we either violate the time bound or the total molecular count bound. 

*If a reaction is converting a populous species to a rare one, the rate of the increase of the rare species 
can be proportional to m times its molecular count. The rate of decrease, however, is always proportional 
to the molecular count of the decreasing species, or proportional to C times the molecular count of the 
decreasing species (as we'll see below). 
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For the following, values / and g will be free parameters to be determined later. It helps 
to think of them as < / < 9 < 1. How long does it take for a reaction to decrease Xi by 
gth. fraction of the violation bound exj? The number of occurrences of Rj to decrease Xi by 
gexi or more is at least gexi/\vij\. The following lemma bounds the time required for these 
many occurrences to happen. 

Lemma A.l. Take any f and g (0 < f,g < 1), any reaction Rj and species Si such that 
Vij < 0, any state x, and any e. Assuming that the propensity of Rj is fixed at dj(x), 
with probability at least 1 — / jg, fewer than gexij\vij\ occurrences of Rj happen in time 
fe/{\vij\kj) if Rj is unimolecular, or time fe/(\vij\kjC) if Rj is bimolecular. 

Proof. For reaction Rj to decrease the amount of Si, it must be that Si is a reactant, and 
thus Xi IS cl factor in the propensity function. Suppose Rj is unimolecular. Then dj — kj x ^ 
and the expected number of occurrences of Rj in time /. .j, is a,-/ 1 e \ , < f-rf%. The 

I "ijj I "*j I I ™j I ^ij I 

desired result then follows from Markov's inequality. If Rj is bimolecular with Si 7^ Sji being 
the other reactant then aj = kj ^y/L • alternatively, a,j = kj Xt<yX ^ ^ if Rj is bimolecular with 
identical reactants. In general for bimolecular reactions aj < kjXiC. So the expected 
number of occurrences of Rj in time f-, — rr-n is ajf-, — frr^ < fr^r- The desired result 

J •> \Uij\kjC J J \Uij\kjC — J \u.ij\ 

follows as before. □ 

Let time f be the minimum over all reactions Rj and Si such that Ujj < of ^- / i\^ij\kj) 
if Rj is unimolecular, or \/{\vij\kjC) if Rj is bimolecular. We can think of f setting the 
units of time for our argument. The above lemma implies that with probability at least 
1 — f jg no reaction decreases Xi by gexi or more within time fef. The following lemma 
defines typical leaps; they are of two types: long or S^-increasing. Recall M is the number 
of reaction channels and N is the number of species. 

Lemma A. 2. (Typical leaps). For any f and g (0 < f,g < 1), and for any e, with 
probability at least 1 — NMf/g one of the following is true of a leap: 

1. (long leap) r > fef 

2. (Si-increasing leap) r < fef, and the leap increases some species Sj at least as Xi 1— > 
Xi + \exi] — gMexi, while no species <SV decreases as much as 1— > x# — gMex^ . 

Proof. By the union bound over the M reaction channels and the iV species, Lemma lA.ll 
implies that the probability that some reaction decreases the amount of some species Si 
by gexi or more in time fef is at most NMf/g. Now suppose this unlucky event does not 
happen. Then if the leap time is r < fef, no decrease is enough to cause a violation of the 
deviation bounds, and thus it must be that some reaction Rj increases some species S% by 
more than exj. (Since Rj must occur an integer number of times, it actually must increase 
Si by \exj~] or more.) Since no reaction decreases Si by gexi or more, we can be sure that 
Si increases at least by \exi\ — gMexi . □ 

Lemma A. 3. For any species Si, a leap decreases Si at most (XS Xi I ^ Xi M[exj\ - 2. 

Proof. At most M reactions may be decreasing Si. A reaction can decrease Si by as much 
as [exj\ without causing a violation of the deviation bounds. The last reaction firing that 
causes the violation of the deviation bounds ending the leap uses up at most 2 molecules of 
Si (since reactions are at most bimolecular). □ 
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N ote that a similar lemm a does not hold for Gillespie's tau-leaping algorithms ([Gillespie 



2001 . 20031 : Cao et al. . 20061 ) because the number of reaction firings in a leap can be only 



bounded probabilistically. With some small probability a leap can result in "catastrophic" 
changes to some molecular counts. Since with enough time such events are certain to occur, 
the asymptotic analysis must consider them. Consequently, asymptotic results analogous 
to those we derive in this section remain to be proved for tau-leaping algorithms other than 
BTL. 

Our goal now is to use the above two lemmas to argue that if we have a lot of leaps, we 
would either violate the molecular count bound (due to many ^-increasing leaps for the 
same Si), or violate the time bound (due to long leaps). Let n be the total number of leaps. 
By Hoeffding's inequality, with probability at least 1 — 2e~ 2n ^ NM ^ 9 ^ (i.e., exponentially 
approaching 1 with n), the total number of atypical steps is bounded as: 

[# of atypical leaps] < 2nNMf/g. (1) 

Further, in order not to violate the time bound t, the number of long steps can be bounded 
as: 

[# of long leaps] < t/(/ef). (2) 

How can we bound the number of the other leaps (^-increasing, for some species Sj)7 
Our argument will be that having too many of such leaps results in an excessive increase 
of a certain species, thus violating the bound on the total molecular count. We start by 
choosing an Si for which there is the largest number of S^-increasing steps. Since there are 
N species, there must be a species Si for which 

[# of Si-increasing leaps] > — [# of Sj'-increasing leaps]. (3) 

At this point, it helps to develop an alternative bit of notation labeling the different 
kinds of leaps with respect to the above-chosen species Si to indicate how much x% may 
change in the leap. Since our goal will be to argue that the molecular count of Si must be 
large, we would like to lower-bound the increase in Si and upper-bound the decrease. An 
atypical leap or a long leap we label "J, j" . By Lemma [A.3l these leaps decrease Si at most as 
Xi i— > Xi — M \_£Xi\ — 2. An Sj-increasing leap we label "|" • Finally, an £V-increasing leap for 
S%i 7^ Si we label "J," . By Lemma lA.2t f leaps increase Si at least as i— > xi + \exi\ —gMexi, 
while J. leaps decrease Si by less than — gMexi. 

We would like to express these operations purely in a multiplicative way so that they 
become commutative, allowing for bounding their total effect on Xi independent of the 
order in which these leaps occurred but solely as a function of the number of each type. 
Further, the multiplicative representation of the leap effects is important because we want 
to bound the number of leaps logarithmically in the maximum molecular count. Note that 
[I leaps cause a problem because of the subtractive constant term, and f leaps cause a 
problem because if X{ drops to multiplicative increases are futile. Nonetheless, for the 
sake of argument suppose we knew that throughout the simulation Xi > 3. Then assuming 
e < 1/(12M), we can bound the largest decrease due to a J. J. leap multiplicatively as Xi i— > 
(1/4) Xi. Further, we lower-bound the increase due to a | leap as Xi h- »■ (1 + (1 — gM)e)xi. 
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Then the lower bound on the final molecular count of Si and therefore the total molecular 
count is 

3(1 + (1 - gM)e) n] (1 - gMe) nl (1/4)" U < m. (4) 

This implies an upper bound on the number of | leaps, that together with (eqns. d])-© 
provides an upper bound on the total number of leaps, as we'll see below. 

However, rcj might dip below 3 (including at the start of the simulation). We can adjust 
the effective number of f leaps to compensate for these dips. We say a leap is in a dip if it 
starts at x\ < 3. Observe that the first leap in a dip starts at Xi < 3 while the leap after a 
dip starts at X{ > 3. Thus, unless we end in a dip, cutting out the leaps in the dips can only 
decrease our lower bound on the final X{. We'll make an even looser bound and modify 
simply by removing the contribution of the f leaps that are in dipsE How many j leaps 
can be in dips? First let us ensure g < 1/(3M). Then since a j leap decreases Xi by less 
than gMexi < Xj/3, and the decrease amount must be an integer, a j. leap cannot bring 
Xi below 3 starting at Xi > 3. Thus if we start at Xi > 3 a H leap must occur before we 
dip below 3. Thus the largest number of dips is + 1 (adding 1 since we may start the 
simulation below 3). Let and n d be the number of f and Jd leaps in the dth dip (we 
don't care about J. leaps in a dip since they must leave Xj unchanged). Then < 2n d + 3 

and Y,d n d < Ed 2n d l + Ed 3 < 2rjU + 3 (™ U + 1) = 5 ™ U + 3 - Therefore, the adjusted 
bound © becomes: 3(1 + (1 - 5 M)e) riT - 5nU - 3 (l - gMe) nl {l/A) nU < m. For simplicity, 
we use the weaker bound 

3(1 + (1 - gM)ef (1 - gMe) nl (l/4) 6nU+3 < m. (5) 

In order to argue that this bounds the number of f leaps, we need to make sure the 
| leaps and the J. J. leaps don't cancel out the effect of the j leaps. By inequality we 
know that < Nn^ . If we can choose g to be a small enough constant such that more 
than N [ leaps are required to cancel the effect of a f leap we would be certain the bound 
increases exponentially with n} without caring about [ leaps. Specifically, we choose a g 
small enough such that (1 + (1 — gM)e)(l — gMe) N > 1 + sjl. For example we can let 
g = — (9/10) 1//iV )Q Note that g depends only on constants N and M and is independent 
of e. The bound then becomes 3(1 + e/2) nT (l/4) 6nU + 3 . 

Thus finally we have the following system of inequalities that are satisfied with proba- 
bility exponentially approaching 1 as n — > oo: 

n = n) + + (6) 

n u < t/(fef) + 2nNMf/g (7) 

3(l + e/2) nT (l/4) 6nii+3 < m. (9) 

*We know we cannot end in a dip if the resulting bound evaluates to 3 or more. Thus technically we 
assume m > 3 for the bound to be always valid. 

t Since g < 1/(3A/), make the simplification (1 + (1 — gM)e) > (1 + 2e/3) and solve for g. The solution 
is minimized when e = 1. 
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Solving for n we obtai: 



Mog(m/3) + (12h + l)tl(fef) + Qh 

ft ^ 

(1 - 24hNMf/g) 

if (l-24hf/g) > where h = (N + l)/log(l +e/2) (also recall 5 = ^(1 - (O/IO) 1 ^)). To 
ensure this we let / < g/(48hNM). Then with probability exponentially approaching 1 as 
n increases, 

n < 21og(m/3) + 96(12/» + l)th/{gef) + 12/t. 

Asymptotically as e - > 0, m ^ oo,t ^ 00 with the system of chemical equations being 
fixed, we have g = 0(1), h < 0(l)/e, and write the above as n < 0(l)(l/e) log m + 
0(l)(l/e) 3 t/f. Recall our unit of time f was defined to be the minimum over all reactions 
Rj and species Si such that Vij < of l/(\vij\kj) if Rj is unimolecular, or l/(\vij\kjC) if 
Rj is bimolecular. No matter what C is, we can say f > 1/(0(1)C + 0(1)). Thus we can 
write the above as 

n < 0(l)(l/e) log m + 0(l)(l/eft(C + 0(1)). 

For any 5, we can find appropriate constants such that the above bound is satisfied with 
probability at least 1 — 5. 

This bound on the number of leaps has been optimized for simplicity of proof rather 
than tightness. A more sophisticated analysis can likely significantly decrease the numerical 
constants. Further, we believe the cubic dependence on 1/e in the time term is excessive J3 



A. 3 Proving Robustness by Monotonicity 

In this section we develop a technique that can be used to prove the robustness of certain SSA 
processes. We use the se results to prove the robustness of the example in Section [3] as well as 
of the construction of Angluin et al. ( 20061 ) simulating a Turing machine in Appendix IA.4I 



Since /9-perturbations are not Markovian, it is difficult to think about them. Can we 
use a property of the original SSA process that would allow us to prove robustness without 
referring to p-perturbations at all? 

Some systems have the property that every reaction can only bring the system "closer" 
to the outcome of interest (or at least "no futher"). Formally, we say an SSA process is 
monotonic for outcome T if for all reachable states x, y such that there is a reaction taking 
x to y, and for all t, the probability of reaching T within time t starting at y is at least the 
probability of reaching T within time t starting at x. Note that by this definition T must 
be absorbing. Intuitively, perturbation of propensities in monotonic systems only change 
how fast the system approaches the outcome. Indeed, we can bound the deviations in 
the outcome probability of any /3-perturbation at any time by two specific p-perturbations, 



'Logarithms are base-2. 

' The cubic dependence on 1/e in the time term is due to having to decrease the probability of an 
atypical step as e decreases. It may be possible to reduce the cubic dependence to a linear one by moving 
up the boundary between a dip and the multiplicative regime as a function of e rather than fixing it at 3. 
The goal is to replace the constant base (i/4)0(i)" u +0(i) term with a ^ _ o(l)e) 0il)nll+0(1) term. Then 
the effect of a J, J. leap would scale with e, as does the effect of an f leap. 
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which are the maximally slowed down and sped up versions of the original process. This 
implies that monotonic SSA processes are robust at all times t when the outcome probability 
does not change quickly with t, and thus slowing down or speeding up the SSA process does 
not significantly affect the probability of the outcome. 

For an SSA process or />perturbation C and set of states T, define F r (C,t) to be the 
probability of being in T at time t. For SSA process C, let C~ p be the />perturbation defined 
by the constant deviations = 1 — p. Similarly, let C +p be the p-perturbation defined 
by the constant deviations £j(t) = 1 + p. 

Lemma A. 4. If an SSA process C is monotonic for outcome T, then for any p-perturbation 
C ofC, F r (C-P,t) < F T (C,t) < F T (C+P,t). 

Proof. If an SSA process is monotonic, allowing extra "spontaneous" transitions (as long 
as they are legal according to the SSA process) cannot induce a delay in entering T. We 
can decompose a perturbation with > 1 as the SSA process combined with some 

extra probability of reaction occurrence in the next interval dt. Thus, for a perturbation 
C of a monotonic SSA process C in which £j(t) > 1, we have F r (C,t) < F T (C,t). By 
a similar argument, if C has £j(t) < 1, then F r (C,t) < F T (C,t). Now C~ p and C +p are 
themselves monotonic SSA processes (C scaled in time). Then by the above bounds, for 
any p-perturbation C of C we have F r (C~ p , t) < F r (C, t) < F T (C +P , t). □ 

Since C~ p and C +p are simply the original SSA process C scaled in time by a factor 
of 1/(1 + p) and 1/(1 — p), respectively, we can write the bound of the above lemma as 
F r (C,t/(l +p)) < F r (C,t) < F T (C,t/(l - p)). Rephrasing LemmaE! 

Corollary A.l. // an SSA process C is monotonic for outcome T then it is (p, 5)-robust with 
respect toF at timet where 5 = F 1 \C +P ,t)-F T \C~ P ,t) = F r (C,t/(l-p))-F r (C,t/(l+p)). 

For many SSA processes, it may not be obvious whether they are monotonic. We would 
like a simple "syntactic" property of the SCRN that guarantees monotonicity and can be 
easily checked. The following lemma makes it easy to prove monotonicity in some simple 
cases. 

Lemma A. 5. Let C be an SSA process and T an outcome of SCRN S. If every species is a 
reactant in at most one reaction in S, and there is a set {nj} such that outcome T occurs 
as soon as every reaction Rj has fired at least nj times, then C is monotonic with respect to 

r. 

Proof. The restriction on T allows us phrase everything in terms of counting reaction oc- 
currences. For every reaction Rj, define Fj(n,t) to be the probability that Rj has fired at 
least n times within time t. Now suppose we induce some reaction to fire by fiat. The only 
way this can decrease some Fj(n, t) is if it decreases the count of a reactant of Rj or makes 
it more likely that some reaction Rji (J' ^ j) will decrease the count of a reactant of Rj. 
Either possibility is avoided if the SCRN has the property that any species is a reactant in 
at most one reaction. Since F r (C,t) = Tij Fj( n j't)i this quantity cannot decrease as well, 
and monotonicity follows. □ 
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A.4 Robust Embedding of a TM in an SCRN 

Since we are trying to bound how the complexity of the prediction problem scales with 
increasing bounds on t and C but not with different SCRNs, we need a method of embedding 
a TM in an SCRN in which the SCRN is independent of the i nput length. Among such 
methods available ( Angluin et al. . 2006 ; Soloveichik et al. . 20081 ). asymptotically the most 



efficient and therefore best for our purposes is the construction of Angluin et al. This 
result is stated in the language of distributed multi-agent systems rather than molecular 
systems; the system is a well-mixed set of "agents" that randomly collide and exchange 
information. Each agent has a finite state. Agents correspond to molecules (the system 
preserves a constant molecular count m); states of agents correspond to the species, and 
interactions between agents correspond to reactions in which both molecules are potentially 
transformed. 

Now for the details of the SCRN implementation of Angluin's protocol. Suppose we 
construct an SCRN corresponding to the Angluin et al system as follows: Agent states 
correspond to species (i.e., for every agent state i there is a unique species Si). For every 
pair of species S , j 1 ,Sj 2 , {i\ < 12) we add reaction + Si 2 — > Si a + Si 4 if the population 
protocol transition function specifies (1x^2) l— * (^3>h)- Note that we allow null reactions 
of the form + Si 2 — > + Si 2 including for i\ = 12- For every reaction Rj, we'll 
use rate constant kj = 1. The sum of all reaction propensities is A = Eli^pU since every 
molecule can react with any other moleculeQ The time until next reaction is an exponential 
random variable with rate A. Note that the transition probabilities between SCRN states 
are the same as the transition probabilities between the corresponding configurations in the 
population protocol since in the SCRN every two molecules are equally likely to react next. 
Thus the SSA process is just a continuous time version of the population protocol process 
(where unit "time" expires between transitions). Therefore the SCRN can simulate a TM 
in the same way as the population protocol. 

But first we need to see how does time measured in the number of interactions correspond 
to the real-valued time in the language of SCRNs? 

Lemma A. 6. If the time between population protocol interactions is an exponential random 
variable with rate A, then for any positive constants c, ci,C2 such that ci < 1 < c%, there 
is Nq such that for all N > No, A interactions occur between time c\N/\ and C2A/A with 
probability at least 1 — N~ 



T-C 



Proof. The Chernoff bound for the left tail of a gamma random variable T with shape 
parameter A and rate A is Pr[T < t] < (jj>) N e N ~ xt for t < A/A. Thus Pr[T < c 1 N/X] < 
(cie 1_Cl ) Ar . Since cie 1_Cl < 1 when c\ 7^ 1, Pr[T < c\N/X\ < N~ c for large enough N. 
An identical argument applies to the right tail Chernoff bound Pr[T > t] < (jj) n e N ~ xt for 
t > A/A. □ 

The following lemma reiterates that an arbitrary computational problem can be em- 
bedded in a chemical system, and also shows that the chemical computation is robust with 



* Just to confirm, splitting the reactions between the same species and between different species, the sum 
of the propensities is £\ +J2 i<i > = w (X* XiXi ~ ^ + 2 Ei<i< ^i x i' ) = W XiX i' ~ 

J2i X i) = 2V USm § th(3 faCt that 2 ^i<i' XiX i' = X i X i' and J2i X * X i + Hi&l X i X i' = x i X i' • 



22 



respect to the outcome of the computation. For a given TM and agent count m, let xj and 
be SCRN states corresponding to the TM halting with a and 1 output respectively. 

Lemma A. 7. Fix a perturbation bound p > 0, 5 > 0, and a randomized TM M with 
a Boolean output. There is an SCRN implementing Angluin et al's population proto- 
col, such that if M(x) halts in no more than ttm steps using no more than st m time, 
then starting with the encoding of x and using m = 0(l)2 Stm molecules, at any time 
t > t ssa = 0(l)W 4m log 4 (m)/m the SSA process is in x~j b with probability that is within 
5 of the probability that M(x) = b. Further, this SSA process is (p,5)-robust with respect to 
states x~f and x/ t at all times t > t ssa . 

The first part of the lemma states that we can embed an arbitrary TM computation in 
an SCRN, such that the TM computation is performed fast and correctly with arbitrarily 
high probability. The second part states that this method can be made arbitrarily robust 
to perturbations of rea ction propensities. The first part follows directly from the results 
of lAnglmn eTai] l|200d ). while the second part requires some additional arguments on our 



part. 

If we only wanted to prove the first part, fix any randomized TM M with a Boolean 
output and any constant 5 > 0. There is a population protocol of Angluin et al that can 
simulate the TM's computation on arbitrary inputs as follows: If on some input x, M uses 
ttm computational time and st m space, then the protocol uses m = 0(l)2 Stm agents, and 
the probability that the simulation is incorrect or takes longer than N = Q(l)^ m mlog 4 m 
interactions is at most 5/2. This is proved by using Theorem 11 of Angluin et al. (2006), 



combined with the standard way of simulating a TM by a register machine using multipli- 
cation by a constant and division by a constant with remainder. The total probability of 
the computation being incorrect or lasting more than N interactions obtained is at most 
0(l)tt m m~ c . Since for any algorithm terminating in t± m steps, 2 Str " > O(l) tt m , we can make 
sure this probability is at most 5/2 by using a large enough constant in m = 0(l)2 Stm . By 
Lemma IA.61 the probability that 0(1) N interactions take longer than 0(1)N/X time to 
occur is at most 5/2. Thus the total probability of incorrectly simulating M on x or taking 
longer than 0(1)N/X = 0(l)Vtt m log 4 (m)/m time is at most 5. The Boolean output of M 
is indicated by whether we end up in state xj or xj . (If the computation was incorrect or 
took too long we can be in neither.) This proves the first part of the lemma. 

We now sketch out the proof of how the robustness of the Angluin et al system can 
be established, completing the proof of Lemma IA.7I The whole proof requires retracing 
the argument in the original paper; here, we just outline how this retracing can be done. 
First, we convert the key lemmas of their paper to use real-valued SCRN time rather than 
the number of interactions. The consequences of the lemmas (e.g., that something happens 
before something else) are preserved and thus the lemmas can be still be used as in the 
original paper to prove the corresponding result for SCRNs. The monotonicity of the 
processes analyzed in the key lemmas can be used to argue that the overall construction is 
robust. 

We need the following consequence of Lemma IA.4I 

Corollary A. 2. If an SSA process C is monotonic for outcome T, and with probability p it 
enters T after time t\ but before time t<i, then for any p -perturbation C of C, the probability 
of entering T after time t\/(l + p) but before time ^/(l — p) is at least p. 
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Proof. Let p\ = F r (C,ti) and p 2 = F r (C,t 2 ). Using Lemma I A. 41 we know that Vi, 
F r (C,t/(l - p)) > F r (C,t). Thus, pi = F r (C,t l ) > F r (C, (1 - p)h). Similarly we obtain 
p 2 = F r {C,t 2 ) < F r (C, (l+p)t 2 ). Thus F T {C, (l+p)t 2 )-F r {C, (l-p)h) >p 2 - Pl =p. □ 



As an example let us illustrate the conversion of Lemma 2 of Angluin et al. ( 20061 ). 



The Lemma bounds the number of interactions to infect k agents in a "one-way epidemic" 
starting with a single infected agent. In the one-way epidemic, a non-infected agent becomes 
infected when it interacts with a previously infected agent. With our notation, this lemma 
states: 

Let N(k) be the number of interactions before a one-way epidemic starting with 
a single infected agent infects k agents. Then for any fixed e > and c > 0, 
there exist positive constants c\ and c 2 such that for sufficiently large total agent 
count m and any k > m £ , cimlnk < N(k) < c 2 mlnk with probability at least 
1 — mT c . 

For any m and k we consider the corresponding SSA process C and outcome T in which at 
least k agents are infected. Since the bounds on N(k) scale at least linearly with m, we can 
use Lemma I A. 61 to obtain: 

Let t(k) be the time before a one-way epidemic starting with a single infected 
agent infects k agents. Then for any fixed e > and c > 0, there exist positive 
constants c\ and c 2 such that for sufficiently large total agent count m and any 
k > m £ , c\mln(k)/X < t(k) < c 2 mln(k) / X with probability at least 1 — m~ c . 

Finally consider the SSA process of the one-way epidemic spreading. The possible reactions 
either do nothing (reactants are either both infected or both non- infected), or a new agent 
becomes infected. It is clear that for any m and k, C is monotonic with respect to outcome 
r in which at least k agents are infected. This allows us to use Corollary IA.2I to obtain: 

Fix any p > 0, and let t(k) be the time before a one-way epidemic starting with 
a single infected agent infects k agents in some corresponding p-perturbation. 
Then for any fixed e > 0, c > 0, there exist positive constants c\ and c 2 such that 
for sufficiently large total agent count m and any k > m £ , cim\n{k) / p)) < 
t(k) < c 2 mln(k) / — p)) with probability at least 1 — m~ c . 

Since p is a constant, what we have effectively done is convert the result in terms of inter- 
actions to a result in terms of real- valued time that is robust to p-perturbations simply by 
dividing by A and using different multiplicative constants. 

The same process can be followed for the key lemmas of Angluin et al (Lemma 3 through 
Lemma 8). This allows us to prove a robust version of Theorem 11 of Angluin et al by 
retracing the argument of their paper using the converted lemmas and the real- valued notion 
of time throughout. Since the only way that time is used is to argue that something occurs 
before something else, the new notion of time, obtained by dividing by A with different 
constants, can always be used in place of the number of interactions. 
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A. 5 Proof of Theorem I5.lt Lower Bound on the Computational Com- 
plexity of the Prediction Problem 

In this section we prove Theorem 15.11 from the text which lower-bounds the computational 
complexity of the prediction problem as a function of m, t, and C. The bound holds even for 
arbitrarily robust SSA processes. The theorem shows that this computational complexity 
is at least linear in t and C, as long as the dependence on m is at most polylogarithmic. 
The result is a consequence of the robust embedding of a TM in an SCRN (Lemma IA.7[) . 

Let the prediction problem be specified by giving the SSA process (via the initial state 
and volume), the target time t, and the target outcome T in some standard encoding such 
that whether a state belongs to T can be computed in time polylogarithmic in m. 

Theorem. Fix any perturbation bound p > and 5 > 0. Assuming the hierarchy conjecture 
( Conjecture \5.1\) . there is an SCRN S such that for any prediction algorithm A and constants 
cx,C2,(3, T/,7 > 0, there is an SSA process C of S and a (m,t,C, 1/3) -prediction problem V 
of C such that A cannot solve V in computational time c\ (logm)^t v (C + C2) 7 if t] < 1 or 
7 < 1. Further, C is (p,8)-robust with respect to V. 

Suppose someone claims that for any fixed SCRN, they can produce an algorithm for 
solving (m, t, C, l/3)-prediction problems for SSA processes of this SCRN assuming the SSA 
process is (p, (5)-robust with respect to the prediction problem for some fixed p and d, and 
further they claim the algorithm runs in computation time at most 

0{l){\og{m)ft^{C + 0{l)y (10) 

for some rj < 1 (/?, 7 > 0). We argue that assuming the hierarchy conjecture is true, such a 
value of rj is impossible. 

To achieve a contradiction of the hierarchy conjecture, consider any function probabilis- 
tically computable in it m (re) = 0(l)n^ time and st m (n) = 0(l)n space for ( = + 1. 
Construct a randomized TM having error at most 1/24 by running the original random- 
ized TM 0(1) times and taking the majority vote. Use Lemma IA.7I to encode the TM 
probabilistically computing this function in a (p, <5)-robust SSA process such that the er- 
ror of the TM simulation is at most 1/24. Then predicting whether the process ends up 
in state xj or x~j 1 provides a probabilistic algorithm for computing this function. The 
resulting error is at most 1/24 + 1/24 + 1/3 = 5/12 < 1/2, where the first term 1/24 is 
the error of the TM, the second term 1/24 is for the additional error of the TM embed- 
ding in the SSA process, and the last term 1/3 is for the allowed error of the prediction 
problem. By repeating O(l) times and taking the majority vote, this error can be reduced 
below 1/3, thereby satisfying the definition of probabilistic computation. How long does 
this method take to evaluate the function? We use V = m so that C is a constant, result- 
ing in t ssa = 0(l)t tm (n) log 4 m = 0(l)n^ +4 since m = 0(l)2 n . Setting up the prediction 
problem by specifying the SSA process (via the initial state and volume), target final state 
and time t ssa requires O(l) logm = O(l) n timeQ Then the prediction problem is solved in 

*By the construction of lAngluin et al.1 {2006"), setting up the initial state requires letting the binary 
expansion of the molecular count of a certain species be equal the input. Since the input is given in binary 
and all molecular counts are represented in binary, this is a linear time operation. Setting up the final state 
xJ Q or x~f 1 is also linear time. Computing the target time for the prediction problem t ssa is asymptotically 
negligible. 
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computation time 0(l)(\og(m))^ts Sa = 0(l)n^ + ^ +4 ^. Thus the total computation time is 
0(l)(n^ + ^ +4 ) ?? +n) which, by our choice of £, is less than 0(l)n < > , leading to a contradiction 
of the hierarchy conjecture. 

Is 7 < 1 possible? Observe that if 7 < i] then the claimed running time of the algorithm 
solving the prediction problem (expression [TO]) with time t ssa = O ( l)Vtt m {n) log 4 (m)/m 
can be made arbitrarily small by decreasing V. This leads to contradiction of the hierarchy 
conjecture. Therefore 7 > r/ > 1. 

A. 6 On Implementing BTL on a Randomized TM 

The idealized BTL algorithm presented in Section 14.11 relies on infinite precision real- value 
arithmetic, while only finite precision floating-point arithmetic is possible on a TM. Further, 
the basic randomness generating operation available to a randomized TM is choosing one 
of a fixed number of alternatives uniformly, which forces gamma and binomial draws to be 
approximated. This complicates estimates of the computation time required per leap, and 
also requires us to ensure that we can ignore round-off errors in floating-point operations 
and tolerate approximate sampling in random number draws. 

Can we implement gamma and binomial random number generators on a randomized 
TM and how much computational time do they require? It is easy to see that arbitrary 
precision uniform [0, 1] random variates can be drawn on a randomized TM in time linear 
in precision. It is likely that approximate gamma and binomial random variables can be 
drawn using methods available in the numerical algorithms literature which uses uniform 
variate draws as the essential primitive. Since many existing methods for efficiently draw- 
ing (approximate) gamma and binomial random variables involve the rejection method, the 
computation time for these operations is likely to be an expectation. Specifically, it seems 
reasonable that drawing gamma and binomial random variables can be approximately imple- 
mented on a randomized TM such that the expected time of these operations is polynomial 
in the length of the floating-point representation of the distribution parameters and the 
resultant random quantityO 

The computational complexity of manipulating integer molecular counts on a TM is 
poly logarithmic in m. Let I be an upper bound on the expected computational time required 
for drawing the random variables and real number arithmetic; I is potentially a function 
of m, V, t, and the bits of precision used. Using Markov's inequality and Theorem 14.11 we 
can then obtain a bound on the total computation time that is true with arbitrarily high 
probability. We make the TM keep track of the total number of computational steps it has 
taker]] and cut off computation when it exceeds the expectation by some fixed factor. Then 
we obtain the following bound on the total computation time: 0(l)((\og(m)) ^ + l)t(C + 
0(1)). 

"The numerical algorithms literature, which assumes that basic floating point operations take unit 
time, describes a number of algorithms for drawing from an (a pproximate) standard gamma d i stribu - 
tion l|Ahrens fc Dieteil [l978), and from a binomial distribution (|Kachitvichvanukul fc Schmeiserl 1 19881 ). 
such that the expected number of floating-point operations does not grow as a function of distribution 
parameters (however, some restrictions on the parameters may be required). On a TM basic arithmetic 
operations take polynomial time in the length of the starting numerical values and the calculated result. 

^Compute the bound and write this many l's on a work tape, and after each computational step, count 
off one of the l's until no more are left. 
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We have three sources of error. First, since BTL simulates a p-perturbation rather than 
the original SSA process, the probability of the outcome may be off by Si, assuming the 
SSA process was (p, Si) -robust. Further, since we are using finite precision arithmetic and 
only approximate random number generation, the deviation from the correct probability 
of the outcome may increase by another 62- Finally, there is a #3 probability that the 
algorithm cuts off computation before it completes. We want to guarantee that the total 
error S1+S2+S3 < S, fulfilling the requirements of solving the (m, t, C, <5)-prediction problem. 
While Si < S as a precondition, and we can make £3 arbitrarily small, a rigorous bound on 
S2 is beyond the scope of this paperQ 
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